Optimal Transport Maps in
Abstract

In the first part of the paper we briefly describe the classical problem,
raised by Monge in 1781, of optimal transportation of mass. We discuss also 
Kantorovich's weak solution of the problem, which leads to general existence 
results, to a dual formulation, and to necessary and sufficient optimality con- 
ditions. 

In the second part we describe some recent progress on the problem of the 
existence of optimal transport maps. We show that in several cases optimal 
transport maps can be obtained by a singular perturbation technique based 
on the theory of Γ-convergence, which yields as a byproduct existence and
stability results for classical Monge solutions.

1. The optimal transport problem and its weak formulation

In 1781, G. Monge raised in [25] the problem of transporting a given distribution
of matter (a pile of sand for instance) into another (an excavation for instance)
in such a way that the work done is minimal. Denoting by
$h_0, h_1 : \mathbf{R}^2 \to [0, +\infty)$ the Borel functions describing the
initial and final distribution of matter, there is obviously a compatibility
condition, that the total mass is the same:

$$\int_{\mathbf{R}^2} h_0(x)\,dx = \int_{\mathbf{R}^2} h_1(y)\,dy. \tag{1.1}$$

Assuming with no loss of generality that the total mass is 1, we say that a Borel
map $\psi : \mathbf{R}^2 \to \mathbf{R}^2$ is a transport if a local version of
the balance of mass condition holds, namely

$$\int_{\psi^{-1}(E)} h_0(x)\,dx = \int_E h_1(y)\,dy \quad \text{for any } E \subset \mathbf{R}^2 \text{ Borel.} \tag{1.2}$$

Then, the Monge problem consists in minimizing the work of transportation in the
class of transports, i.e.

$$\inf \left\{ \int_{\mathbf{R}^2} |\psi(x) - x|\, h_0(x)\,dx : \ \psi \text{ transport} \right\}. \tag{1.3}$$

The Monge transport problem can be easily generalized in many directions, 
and all these generalizations have proved to be quite useful: 

• General measurable spaces $X$, $Y$, with measurable maps $\psi : X \to Y$;

• General probability measures $\mu$ in $X$ and $\nu$ in $Y$. In this case the local
balance of mass condition (1.2) reads as follows:

$$\nu(E) = \mu(\psi^{-1}(E)) \quad \text{for any } E \subset Y \text{ measurable.} \tag{1.4}$$

This means that the push-forward operator $\psi_\#$ induced by $\psi$, mapping probability
measures in $X$ into probability measures in $Y$, maps $\mu$ into $\nu$.

• General cost functions: a measurable map $c : X \times Y \to [0, +\infty]$. In this case
the cost to be minimized is

$$W(\psi) := \int_X c(x, \psi(x))\,d\mu(x).$$

Even in Euclidean spaces, the problem of existence of optimal transport maps
is far from being trivial, mainly due to the non-linearity with respect to $\psi$ of the
condition $\psi_\# \mu = \nu$. In particular the class of transports is not closed with
respect to any reasonable weak topology. Furthermore, it is easy to build examples
where the Monge problem is ill-posed simply because there is no transport map: this
happens for instance when $\mu$ is a Dirac mass and $\nu$ is not a Dirac mass.
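
The balance condition (1.4) and the non-existence example above can be checked
directly for finitely supported measures. The following is a minimal sketch, not
part of the original text: measures are represented as dictionaries mapping atoms to
masses, the helper names `push_forward` and `transport_maps` are hypothetical, and
exact rational arithmetic is used to avoid rounding issues.

```python
from fractions import Fraction
from itertools import product


def push_forward(mu, psi):
    """Return psi_# mu for a finitely supported measure mu (dict atom -> mass)."""
    nu = {}
    for x, mass in mu.items():
        y = psi[x]
        nu[y] = nu.get(y, 0) + mass
    return nu


def transport_maps(mu, nu):
    """Enumerate all maps psi : spt(mu) -> spt(nu) with psi_# mu = nu."""
    xs, ys = list(mu), list(nu)
    maps = []
    for images in product(ys, repeat=len(xs)):
        psi = dict(zip(xs, images))
        if push_forward(mu, psi) == nu:
            maps.append(psi)
    return maps


half = Fraction(1, 2)

# Two atoms sent to two atoms: exactly two transport maps (the two bijections).
mu = {0: half, 1: half}
nu = {2: half, 3: half}
print(len(transport_maps(mu, nu)))      # 2

# A Dirac mass cannot be split by a map, so no transport exists.
dirac = {0: Fraction(1)}
print(transport_maps(dirac, nu))        # []
```

The brute-force enumeration is only meant for a handful of atoms; it makes visible
that the constraint is a condition on the map itself, not on a linear quantity.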

In order to overcome these difficulties, in 1942 L.V.Kantorovich proposed in
|2T) a notion of weak solution of the transport problem. He suggested looking for
plans instead of transports, i.e. probability measures $\gamma$ in $X \times Y$ whose
marginals are $\mu$ and $\nu$. Formally this means that $\pi_{X\#}\gamma = \mu$ and
$\pi_{Y\#}\gamma = \nu$, where $\pi_X : X \times Y \to X$ and $\pi_Y : X \times Y \to Y$ are the
canonical projections. Denoting by $\Pi(\mu, \nu)$ the class of plans, he wrote the
following minimization problem

$$\min \left\{ \int_{X \times Y} c(x, y)\,d\gamma : \ \gamma \in \Pi(\mu, \nu) \right\}. \tag{1.5}$$

Notice that $\Pi(\mu, \nu)$ is not empty, as the product $\mu \otimes \nu$ has $\mu$ and $\nu$ as
marginals. Due to the convexity of the new constraint $\gamma \in \Pi(\mu, \nu)$ it turns out
that weak topologies can be effectively used to provide existence of solutions to
(1.5): this happens for instance whenever $X$ and $Y$ are Polish spaces and $c$ is lower
semicontinuous (see for instance [28]). Notice also that, by convexity of the energy,
the infimum is attained on an extremal element of $\Pi(\mu, \nu)$.

The connection between the Kantorovich formulation of the transport problem
and Monge's original one can be seen by noticing that any transport map $\psi$ induces
a planning $\gamma$, defined by $(Id \times \psi)_\# \mu$. This planning is concentrated on the
graph of $\psi$ in $X \times Y$ and it is easy to show that the converse holds, i.e. whenever
$\gamma$ is concentrated on a graph, then $\gamma$ is induced by a transport map. Since any
transport induces a planning with the same cost, it turns out that

$$\inf\,(1.3) \ \ge\ \min\,(1.5).$$

Moreover, by approximating any plan by plans induced by transports, it can be 
shown that equality holds under fairly general assumptions (see for instance [3]). 
Therefore we can really consider the Kantorovich formulation of the transport prob- 
lem as a weak formulation of the original problem. 
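
For finitely supported measures a plan is just a nonnegative array with prescribed
row and column sums, which makes the weak formulation easy to experiment with. The
sketch below is an illustration only (the names `marginals`, `plan_cost`,
`product_plan` and `plan_from_map` are hypothetical, not taken from the text). It
checks that the product plan and the plan $(Id \times \psi)_\# \mu$ both have the right
marginals, and that the latter has the same cost as the map $\psi$.

```python
from fractions import Fraction


def marginals(gamma):
    """First and second marginals of a plan gamma (dict (x, y) -> mass)."""
    mu, nu = {}, {}
    for (x, y), mass in gamma.items():
        mu[x] = mu.get(x, 0) + mass
        nu[y] = nu.get(y, 0) + mass
    return mu, nu


def plan_cost(gamma, c):
    """Kantorovich cost: integral of c(x, y) with respect to gamma."""
    return sum(c(x, y) * mass for (x, y), mass in gamma.items())


def product_plan(mu, nu):
    """The product measure of mu and nu, which always has marginals mu and nu."""
    return {(x, y): mx * my for x, mx in mu.items() for y, my in nu.items()}


def plan_from_map(mu, psi):
    """The plan (Id x psi)_# mu, concentrated on the graph of psi."""
    return {(x, psi[x]): mass for x, mass in mu.items()}


def distance(x, y):
    return abs(x - y)


half = Fraction(1, 2)
mu = {0: half, 1: half}
nu = {2: half, 3: half}
psi = {0: 2, 1: 3}                      # a transport map: psi_# mu = nu

induced = plan_from_map(mu, psi)
spread = product_plan(mu, nu)

print(marginals(induced) == (mu, nu))   # True
print(marginals(spread) == (mu, nu))    # True
print(plan_cost(induced, distance))     # 2, the Monge cost of psi
print(plan_cost(spread, distance))      # 2 (here y > x always, so every plan costs the same)
```

The product plan splits the mass of each atom, so it is admissible for (1.5) without
being induced by any map.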

If all extremal points of $\Pi(\mu, \nu)$ were induced by transports one would get
existence of transport maps directly from the Kantorovich formulation. It is not
difficult to show that plannings $\gamma$ induced by transports are extremal in $\Pi(\mu, \nu)$.
The converse holds in some very particular cases, but unfortunately it is not true
in general. It turns out that the existence of optimal transport maps depends not
only on the geometry of $\Pi(\mu, \nu)$, but also (in a quite sensitive way) on the choice of
the cost function $c$.

2. Existence of optimal transport maps 

In this section we focus on the problem of the existence of optimal transport 
maps in the sense of Monge. Before discussing in detail in the next sections the two 
model cases in which the cost function is the square of a distance or a distance (we 
refer to [19] for the case of concave functions of the distance, not discussed here),
it is better to give an informal description of the tools by now available for proving 
the existence of optimal transport maps. 

Strategy A (Dual formulation). This strategy is based on the duality formula

$$\min\,(\mathrm{MK}) = \sup \left\{ \int_X h\,d\mu + \int_Y k\,d\nu \right\}, \tag{2.6}$$

where the supremum runs among all pairs $(h, k) \in L^1(\mu) \times L^1(\nu)$ such that
$h(x) + k(y) \le c(x, y)$. The duality approach to the (MK) problem was developed by
Kantorovich, and then extended to more general cost functions (see [22]). The
transport map is obtained from an optimal pair $(h, k)$ in the dual formulation by
making a first variation. This strategy for proving the existence of an optimal
transport map goes back to the papers ^Hj and ^I].
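
The easy half of the duality formula can be verified in one line; it is recorded
here as a reading aid and is not taken from the original text. Assuming
$\gamma \in \Pi(\mu, \nu)$ and a pair $(h, k)$ with $h(x) + k(y) \le c(x, y)$, the marginal
conditions give

```latex
\int_X h\,d\mu + \int_Y k\,d\nu
  = \int_{X\times Y} \bigl(h(x) + k(y)\bigr)\,d\gamma(x,y)
  \le \int_{X\times Y} c(x,y)\,d\gamma(x,y).
```

Taking the supremum on the left and the minimum on the right shows that the
supremum never exceeds the minimum; the content of the duality formula is the
opposite inequality. Equality for a given $\gamma$ and $(h, k)$ forces
$h(x) + k(y) = c(x, y)$ $\gamma$-almost everywhere.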

Strategy B (Cyclical monotonicity). In some situations the necessary (and
sufficient) minimality conditions for the primal problem, based upon the so-called
c-cyclical monotonicity ([221, |2H1j EH| ) yield that any optimal Kantorovich
solution $\gamma$ is concentrated on a graph $\Gamma$ (i.e. for $\mu$-a.e. $x$ there exists a unique $y$
such that $(x, y) \in \Gamma$) and therefore is induced by a transport $\psi$.
This happens for instance when $c(x, y) = H(x - y)$, with $H$ strictly convex in
$\mathbf{R}^n$. This approach is pursued in the papers [3"U|.

Strategy C (Singular perturbation with strictly convex costs). One can try to get
an optimal transport map by making the cost strictly convex through a perturbation
and then passing to the limit (see ^2 and Theorem 4.1, Theorem 4.2 below). The
main difficulty is to show (strong) convergence at the level of the transport maps
and not only at the level of transport plans.

Strategy D (Reduction to a lower dimensional problem). This strategy was
initiated by V.N.Sudakov in [33]. It consists in writing (typically through a
disintegration) $\mu$ and $\nu$ as the superposition of measures concentrated on lower
dimensional sets and in solving the lower dimensional transport problems, trying in
the end to "glue" all the partial transport maps into a single transport map. This
strategy is discussed in detail in and used, together with a "variational"
decomposition, in j^j. The simplest case is when the lower dimensional problems are
1-dimensional, since the solution of the 1-dimensional transport problem is simply
given by an increasing rearrangement, at least for convex functions of the distance
(see for instance El, EH, ESI).
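
The 1-dimensional statement at the end of Strategy D can be tested on a small
example. The sketch below is an illustration under simplifying assumptions: both
measures consist of the same number of equally weighted atoms, the cost is the
square of the distance, and the function names are hypothetical. It compares the
increasing rearrangement (sort both sets of atoms and pair them in order) with an
exhaustive search over all bijections.

```python
from itertools import permutations


def monotone_matching(xs, ys):
    """Pair the i-th smallest source point with the i-th smallest target point."""
    return list(zip(sorted(xs), sorted(ys)))


def matching_cost(pairs, cost):
    """Total cost of a matching (unnormalized: each atom counts with weight 1)."""
    return sum(cost(x, y) for x, y in pairs)


def brute_force_min(xs, ys, cost):
    """Minimum cost over all bijections between equally weighted atoms."""
    return min(
        matching_cost(zip(xs, perm), cost) for perm in permutations(ys)
    )


def squared(x, y):
    return (x - y) ** 2


xs = [3, 0, 5, 1]          # atoms of mu, each with mass 1/4
ys = [2, 7, 4, 6]          # atoms of nu, each with mass 1/4

pairs = monotone_matching(xs, ys)
print(pairs)                                   # [(0, 2), (1, 4), (3, 6), (5, 7)]
print(matching_cost(pairs, squared))           # 26
print(brute_force_min(xs, ys, squared))        # 26
```

Sorting costs $O(n \log n)$ operations, while the exhaustive search grows like $n!$;
this is what makes a reduction to 1-dimensional problems attractive.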

Strategies A and B are basically equivalent and yield existence and uniqueness
at the same time: the first one may be preferable to some, as it involves very
little measure-theoretic apparatus. On the other hand, it strongly depends on
the existence of maximizing pairs in the dual formulation, and this existence issue
can be more subtle than the existence issue for the primal problem (see [28] and
the discussion in 0). For this reason it seems that the second strategy can work
for more general classes of cost functions.

Strategies C and D have been devised to deal with situations where the cost
function is convex but not strictly convex. These two strategies are also closely
related, as the strictly convex perturbation often leads to an effective dimension
reduction of the problem (see for instance [H]).

3. cost = distance$^2$

In this section we consider the case when $X = Y$ and the cost function $c$
is proportional to the square of a distance $d$. For convenience we normalize $c$ so
that $c = d^2/2$. The first result in the Euclidean space $\mathbf{R}^n$ has been discovered
independently by many authors: Y.Brenier [8], [§], S.T.Rachev and L.Rüschendorf
E3, [22], and C.Smith and M.Knott |3T].

Theorem 3.1 Assume that $\mu$ is absolutely continuous with respect to $\mathcal{L}^n$ and that
$\mu$ and $\nu$ have finite second order moments. Then there exists a unique optimal
transport map $\psi$. Moreover $\psi$ is the gradient of a convex function.

In this case the proof comes from the fact that both strategies A and B yield
that the displacement $x - \psi(x)$ is the gradient of a c-concave function, i.e. a
function representable as

$$h(x) = \inf_{(y,t) \in I} c(x, y) + t \qquad \forall x \in \mathbf{R}^n$$

for a suitable non-empty set $I \subset Y \times \mathbf{R}$. The concept of c-concavity [25] has been
extensively used to develop a very general duality theory for the (MK) problem,
based on (2.6). In this special Euclidean situation it is immediate to realize that
c-concavity of $h$ is equivalent to concavity (in the classical sense) of
$h - \frac{1}{2}|x|^2$, hence

$$\psi(x) = x - \nabla h(x) = \nabla \left[ \tfrac{1}{2}|x|^2 - h(x) \right]$$

is the gradient of a convex function. Finally, notice that the assumption on $\mu$ can
be sharpened (see ^2]), assuming for instance that $\mu(B) = 0$ whenever $B$ has finite
$\mathcal{H}^{n-1}$-measure. This is due to the fact that the non-differentiability set of a
concave function is $\sigma$-finite with respect to $\mathcal{H}^{n-1}$. Also the assumption
about second order moments can be relaxed, assuming only that the infimum of the
(MK) problem with data $\mu$, $\nu$ is finite.
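
The equivalence between c-concavity of $h$ and concavity of $h - \frac{1}{2}|x|^2$ used
in the proof follows by expanding the square in the definition of c-concavity. The
short computation below is added as a reading aid; it assumes $c(x, y) = |x - y|^2/2$
and $h$ represented through a set $I$ as above.

```latex
h(x) - \tfrac12 |x|^2
  = \inf_{(y,t)\in I} \Bigl( \tfrac12 |x-y|^2 + t \Bigr) - \tfrac12 |x|^2
  = \inf_{(y,t)\in I} \Bigl( -\langle x, y\rangle + \tfrac12 |y|^2 + t \Bigr).
```

The right-hand side is an infimum of affine functions of $x$, hence concave, so
$\frac{1}{2}|x|^2 - h$ is convex. Conversely, any infimum of affine functions
$-\langle x, y\rangle + s$ is of this form with $t = s - \frac{1}{2}|y|^2$.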

The following result, due to R.McCann [35], is much more recent.

Theorem 3.2 Assume that $M$ is a $C^3$, complete Riemannian manifold with no
boundary and $d$ is the Riemannian distance. If $\mu$, $\nu$ have finite second order
moments and $\mu$ is absolutely continuous with respect to $\mathrm{vol}_M$, then there exists a
unique optimal transport map $\psi$.

Moreover there exists a c-concave potential $h : M \to \mathbf{R}$ such that

$$\psi(x) = \exp_x(-\nabla h(x)) \qquad \mathrm{vol}_M\text{-a.e.}$$

This Riemannian extension of Theorem 3.1 is non trivial, due to the fact that
$d^2$ is not smooth in the large. The proof uses some semiconcavity estimates for $d^2$
and the fact that $d^2$ is $C^2$ for $x$ close to $y$ (this is where the $C^3$ assumption on
$M$ is needed). It is interesting to notice that the results of |2U (where the eikonal
equation is read in local coordinates), based on the theory of viscosity solutions
(see in particular Theorem 5.3 of [22]), allow one to push McCann's technique up to
$C^2$ manifolds.

Can we go beyond Riemannian manifolds in the existence theory? A model
case is given by stratified Carnot groups endowed with the Carnot-Carathéodory
metric $d_{CC}$, as these spaces arise in a very natural way as limits of Riemannian
manifolds with respect to the Gromov-Hausdorff convergence (see [10]). At this
moment a general strategy is still missing, but some preliminary investigations in
the Heisenberg group $\mathbf{H}^n$ show that positive results analogous to the Riemannian
ones can be expected. The following result is proved in ^6 ; :

Theorem 3.3 If $n = 1, 2$ and $\mu$ is a probability measure in $\mathbf{H}^n$ absolutely
continuous with respect to $\mathcal{L}^{2n+1}$, then:

(a) there exists a unique optimal transport map $\psi$, deriving from a c-concave
potential $h$;

(b) if $d_p \uparrow d_{CC}$ are Riemannian left invariant metrics, then McCann's optimal
transport maps $\psi_p$ relative to $c_p = d_p^2/2$ converge in measure to $\psi$ as $p \to \infty$.

The restriction to $\mathbf{H}^n$, $n \le 2$, arises from the fact that so far we have been able
to carry out some explicit computations only for $n \le 2$. We expect that this
restriction could be removed. The proof of (b) is not direct, as McCann's exponential
representation $\psi_p = \exp^p(-\nabla^p h_p)$ "degenerates" as $p \to \infty$, because the
injectivity radius of the approximating manifolds tends to 0. This is due to the fact
that in CC metric spaces geodesics exist but are not unique, not even in the small.

Finally, if we replace $c$ by the square of the Koranyi norm (related to the
fundamental solution of the Kohn sub-Laplacian), namely

$$c(x, y) := \frac{1}{2} \|y^{-1} x\|^2 \quad \text{with} \quad \|(z, t)\| := \sqrt[4]{|z|^4 + t^2}$$

(here we identify $\mathbf{H}^n$ with $\mathbf{C}^n \times \mathbf{R}$) then we are still able to prove
existence in any Heisenberg group $\mathbf{H}^n$. The proof uses some fine properties of BV
functions on sub-Riemannian groups [2]. However, we can't hope for a Riemannian
approximation result, as the Koranyi norm induces a metric $d_K$ which is not geodesic.
It turns out that the geodesic metric associated to $d_K$ is a constant multiple of
$d_{CC}$.

4. cost = distance 

In this section we consider the case when $X = Y$ and the cost function $c$ is
a distance. In this case both strategies A and B give only partial information
about the location of $y$, for given $x$. In particular it is not true that any optimal
Kantorovich plan $\gamma$ is induced by a transport map. Indeed, if the first order moments
of $\mu$ and $\nu$ are finite, the dual formulation provides us with a maximizing pair
$(h, k) = (u, -u)$, with $u : X \to \mathbf{R}$ 1-Lipschitz. If $X = \mathbf{R}^n$ and the distance is
induced by a norm $\| \cdot \|$, this provides the implication

$$(x, y) \in \mathrm{spt}\,\gamma \quad \Longrightarrow \quad y \in \{ x - s\xi : \ \xi \in (du(x))^*, \ s \ge 0 \} \tag{4.7}$$

at any differentiability point of $u$. Here we consider the natural duality map between
covectors and vectors given by

$$L^* := \{ \xi \in \mathbf{R}^n : \ L(\xi) = \|L\|_* \text{ and } \|\xi\| = 1 \}.$$

The most favourable case is when the norm is strictly convex (e.g. the Euclidean
norm): in this situation the $*$ operator is single-valued and we recover from (4.7)
information on the direction of transportation, i.e. $(du(x))^*$, but not on the
length of transportation. If the norm is not strictly convex (e.g. the $\ell^1$ or
$\ell^\infty$ norm) then even the information on the direction of transportation, encoded
in $(du(x))^*$, is partial.
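
The lack of information on the length of transportation is already visible on the
real line. In the sketch below (an illustration, not taken from the original text;
all names are hypothetical) two equally weighted atoms at 0 and 1 are moved onto
atoms at 1 and 2. With $c(x, y) = |x - y|$, translating both atoms by 1 and moving
only the atom at 0 give the same cost, while for the strictly convex cost
$|x - y|^2$ only the translation is optimal.

```python
from itertools import permutations


def all_transport_costs(xs, ys, cost):
    """Cost of every bijection between equally weighted atoms xs and ys."""
    results = {}
    for perm in permutations(ys):
        pairing = tuple(zip(xs, perm))
        results[pairing] = sum(cost(x, y) for x, y in pairing)
    return results


def minimizers(costs):
    """All pairings that attain the minimal cost."""
    best = min(costs.values())
    return [pairing for pairing, value in costs.items() if value == best]


xs = [0, 1]
ys = [1, 2]

linear = all_transport_costs(xs, ys, lambda x, y: abs(x - y))
quadratic = all_transport_costs(xs, ys, lambda x, y: (x - y) ** 2)

print(minimizers(linear))       # [((0, 1), (1, 2)), ((0, 2), (1, 1))]
print(minimizers(quadratic))    # [((0, 1), (1, 2))]
```

In both optimal maps for the distance cost the mass moves to the right, in agreement
with (4.7): the direction is determined, the length is not.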

The first attempt to bypass these difficulties came with the work of V.N.Sudakov
[33], who claimed to have a solution for any distance cost function induced by a
norm. Sudakov's approach is based on a clever decomposition of the space $\mathbf{R}^n$ in
affine regions with variable dimension where the Kantorovich dual potential $u$
associated to the transport problem is an affine function. His strategy is to solve
the transport problem in each of these regions, finally getting an optimal transport
map just by gluing all these transport maps. An essential ingredient in his proof is
Proposition 78, where he states that, if $\mu \ll \mathcal{L}^n$, then the conditional measures
induced by the decomposition are absolutely continuous with respect to the Lebesgue
measure (of the correct dimension). However, it turns out that this property is not
true in general even for the simplest decomposition, i.e. the decomposition in
segments: G.Alberti, B.Kirchheim and D.Preiss found an example of a compact family
of pairwise disjoint open segments in $\mathbf{R}^3$ such that the family $M$ of their midpoints
has strictly positive Lebesgue measure (the construction is a variant of previous
examples due to A.S.Besicovitch and D.G.Larman, see also [2]). In this
case, choosing $\mu = \mathcal{L}^3 \llcorner M$, the conditional measures induced by the decomposition
are Dirac masses. Therefore it is clear that this kind of counterexample should be
ruled out by some kind of additional "regularity" property of the decomposition.
In this way the Sudakov strategy would be fully rigorous. As noticed in 0, this
regularity comes for free only in the case $n = 2$, using the fact that transport rays
do not cross in their interior.

Several years later, L.C.Evans and W.Gangbo made remarkable progress in
[15], showing by differential methods the existence of a transport map, under the
assumption that $\mathrm{spt}\,\mu \cap \mathrm{spt}\,\nu = \emptyset$, that the two measures are absolutely
continuous with respect to $\mathcal{L}^n$ and that their densities are Lipschitz functions
with compact support. The missing piece of information about the length of
transportation is recovered by a p-laplacian approximation

$$-\mathrm{div}\left( |\nabla u|^{p-2} \nabla u \right) = \mu - \nu, \qquad u \in H_0^1(B_R), \quad R \gg 1$$

obtaining in the limit as $p \to +\infty$ a nonnegative function $a \in L^\infty(\mathbf{R}^n)$ and a
1-Lipschitz function $u$ solving

$$-\mathrm{div}(a \nabla u) = \mu - \nu, \qquad |\nabla u| = 1 \quad \mathcal{L}^n\text{-a.e. on } \{a > 0\}.$$

The diffusion coefficient $a$ in the PDE above plays a special role in the theory.
Indeed, one can show (see [2]) that the measure $\sigma := a \mathcal{L}^n$, the so-called
transport density, can be represented in several different ways, and in particular as

$$\sigma(B) = \int \mathcal{H}^1(B \cap [x, y])\,d\gamma(x, y) \qquad \forall B \subset \mathbf{R}^n \text{ Borel} \tag{4.8}$$

for some optimal planning $\gamma$. Notice that the total mass of $\sigma$ is $\int |x - y|\,d\gamma$, the
total work done, and the meaning of $\sigma(B)$ is the work done within $B$ during the
transport process. This representation of the transport density has been introduced
by G.Bouchitté and G.Buttazzo in [7], who showed that a constant multiple of
the transport density is a solution of their so-called mass optimization problem.
Later, in 0, it was shown that there is actually a 1-1 correspondence between
solutions of the mass optimization problem and transport densities, defined as in
(4.8).

One can also show ([2], |lt>|, [14]) that $\sigma$ is unique (unlike $\gamma$) if either $\mu$
or $\nu$ is absolutely continuous. Moreover, the nonlinear operator mapping
$(\mu, \nu) \in L^1 \times L^1$ into $\sigma \in L^1$ maps $L^p \times L^p$ into $L^p$ for $1 \le p \le \infty$.
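
Formula (4.8) is easy to evaluate for a finitely supported plan on the real line,
where $\mathcal{H}^1(B \cap [x, y])$ is the length of an intersection of intervals. The
following sketch is only an illustration under these assumptions (the function names
are hypothetical, and $B$ is restricted to intervals $[a, b]$). It checks that the
total mass of $\sigma$ equals the transport cost, and evaluates $\sigma$ on a subinterval
for two different optimal plans between the same measures.

```python
from fractions import Fraction


def overlap_length(a, b, x, y):
    """Length of the intersection of [a, b] with the segment joining x and y."""
    lo, hi = min(x, y), max(x, y)
    return max(0, min(b, hi) - max(a, lo))


def transport_density(gamma, a, b):
    """sigma([a, b]) = sum over (x, y) of gamma(x, y) * H^1([a, b] ∩ [x, y])."""
    return sum(
        mass * overlap_length(a, b, x, y) for (x, y), mass in gamma.items()
    )


half = Fraction(1, 2)
# Plan moving the atom at 0 to 2 and leaving the atom at 1 in place.
gamma = {(0, 2): half, (1, 1): half}
# Plan translating both atoms by 1; it has the same marginals and the same cost.
other = {(0, 1): half, (1, 2): half}

cost = sum(mass * abs(x - y) for (x, y), mass in gamma.items())
print(cost)                                  # 1
print(transport_density(gamma, 0, 2))        # 1, the total work
print(transport_density(gamma, 0, 1))        # 1/2, the work done inside [0, 1]
print(transport_density(other, 0, 1))        # 1/2, same value for the other plan
```

In this example the two plans differ while the values of $\sigma$ on the tested
intervals coincide.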

Coming back to the problem of the existence of optimal transport maps with
Euclidean distance $|x - y|$ (or, more generally, with a distance induced by a $C^2$
and uniformly convex norm), the first existence results for general absolutely
continuous measures $\mu$, $\nu$ with compact support have been independently obtained by
L.Caffarelli, M.Feldman and R.McCann in J2| and by N.Trudinger and L.Wang
in [34]. Afterwards, the author established in [2] the existence of an optimal
transport map assuming only that the initial measure $\mu$ is absolutely continuous, and
the results of ^2] and jS] have been extended to a Riemannian setting in JJj. All
these proofs involve basically a Sudakov decomposition in transport rays, but the
technical implementation of the idea is different from paper to paper: for instance
in ^21 a local change of variable is made, so that transport rays become parallel
and Fubini's theorem, in place of abstract disintegration theorems for measures, can
be used. The proof in |Hj, instead, uses the co-area formula to show that absolute
continuity with respect to Lebesgue measure is stable under disintegration.

The following result is a slight improvement of J2|>, where existence of an
optimal transport map was established but not the stability property. The result
holds under regularity and uniform convexity assumptions for the norm $\| \cdot \|$.

Theorem 4.1 Let $\mu$, $\nu$ be with compact support, with $\mu \ll \mathcal{L}^n$, and let $\psi_\varepsilon$ be
the unique optimal transport maps relative to the costs
$c_\varepsilon(x, y) := \|x - y\|^{1+\varepsilon}$. Then $\psi_\varepsilon$
converge as $\varepsilon \downarrow 0$ to an optimal transport map $\psi$ for $c(x, y) = \|x - y\|$.

The proof is based only on the fact that any plan $\gamma_0$, limit of some sequence
of plans $(Id \times \psi_{\varepsilon_i})_\# \mu$, is not only optimal for the (MK) problem, but also for
the secondary one

$$\min \left\{ \int_{\mathbf{R}^n \times \mathbf{R}^n} \|x - y\| \ln \|x - y\|\,d\gamma : \ \gamma \in \Pi_1(\mu, \nu) \right\}, \tag{4.9}$$

where $\Pi_1(\mu, \nu)$ denotes the class of all optimal plannings for the Kantorovich
problem (the entropy function in (4.9) comes from the Taylor expansion of $c_\varepsilon$ around
$\varepsilon = 0$). It turns out that this additional minimality property selects a unique plan
induced by a transport $\psi$ and, a posteriori, $\psi$ is the same map built in [12]. A class
of counterexamples built in [3] shows that the absolute continuity assumption on $\mu$
cannot be weakened, unlike the strictly convex case.
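
The origin of the secondary problem can be made explicit by a first-order expansion.
The computation below is a heuristic reading aid, not taken from the original text.
It assumes that the perturbed costs of Theorem 4.1 have the form
$c_\varepsilon(x, y) = \|x - y\|^{1+\varepsilon}$, and that $\|x - y\|$ stays bounded on the
supports of the plans considered. For $x \ne y$,

```latex
\|x-y\|^{1+\varepsilon}
  = \|x-y\|\, e^{\varepsilon \ln \|x-y\|}
  = \|x-y\| + \varepsilon\, \|x-y\| \ln \|x-y\| + O(\varepsilon^2),
\qquad\text{so}\qquad
\int c_\varepsilon \, d\gamma
  = \int \|x-y\| \, d\gamma
  + \varepsilon \int \|x-y\| \ln \|x-y\| \, d\gamma + O(\varepsilon^2).
```

Heuristically, a limit of minimizers of the left-hand side first minimizes the
zeroth-order term, which is the Kantorovich cost, and then, among those minimizers,
the first-order term, which is the entropy-like integral.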

This "variational" procedure seems to select extremal elements of $\Pi(\mu, \nu)$ in
a very effective way. This phenomenon is apparent in view of the following result
jS], which holds for all "crystalline" norms $\| \cdot \|$ (i.e. norms whose unit sphere is
contained in finitely many hyperplanes).

Theorem 4.2 Let $\mu$, $\nu$ be as in Theorem 4.1 and let $\psi_\varepsilon$ be the unique optimal
transport maps relative to the costs

$$c_\varepsilon(x, y) := \|x - y\| + \varepsilon |x - y| + \varepsilon^2 |x - y| \ln |x - y|.$$

Then $\psi_\varepsilon$ converge as $\varepsilon \downarrow 0$ to an optimal transport map $\psi$ for
$c(x, y) = \|x - y\|$.

In this case a secondary and a tertiary variational problem are involved, and
we show that the latter has a unique solution which is also induced by a transport.

Some borderline cases between "crystalline" norms and "Euclidean" norms
apparently can't be attacked by any of the existing techniques. In particular the
existence of optimal transport maps for the cost induced by a general norm in $\mathbf{R}^n$,
$n \ge 3$, is still open.